Overview

Dataset statistics

Number of variables: 10
Number of observations: 58000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.4 MiB
Average record size in memory: 80.0 B
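
The two memory figures can be cross-checked with a little arithmetic. The sketch below assumes every one of the 10 columns is stored as an 8-byte value (for example float64 or int64); the report does not state the dtypes, so that width is an assumption, and index overhead is ignored.

```python
# Figures taken from the dataset statistics above.
n_rows = 58000
n_cols = 10

# Assumption (not stated in the report): each value occupies 8 bytes.
bytes_per_value = 8

total_bytes = n_rows * n_cols * bytes_per_value
total_mib = total_bytes / 2**20          # bytes -> MiB
record_bytes = total_bytes / n_rows      # average bytes per row

print(f"Total size: {total_mib:.1f} MiB")
print(f"Average record size: {record_bytes:.1f} B")
```

Under that assumption the results round to the 4.4 MiB and 80.0 B reported above.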

Variable types

NUM: 10

Reproduction

Analysis started: 2020-08-25 01:51:36.003331
Analysis finished: 2020-08-25 01:51:53.993236
Duration: 17.99 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

A8 is highly correlated with A5 [High correlation]
A5 is highly correlated with A8 [High correlation]
A4 is highly skewed (γ1 = 31.68785333) [Skewed]
A6 is highly skewed (γ1 = -21.86181042) [Skewed]
A2 has 35878 (61.9%) zeros [Zeros]
A4 has 38055 (65.6%) zeros [Zeros]
A5 has 706 (1.2%) zeros [Zeros]
A6 has 18420 (31.8%) zeros [Zeros]
A9 has 20134 (34.7%) zeros [Zeros]
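
The quantities behind the Zeros and Skewed warnings can be sketched in plain Python. The helper names below are hypothetical, and the skewness estimator is assumed to be the bias-adjusted sample skewness (the report does not say which estimator produced γ1).

```python
import math


def zero_share(values):
    """Return (count, percent) of entries that are exactly zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)


def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed estimator for gamma_1)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Reproduce the percentage in the A2 warning from its raw counts.
a2_zero_percent = 100.0 * 35878 / 58000
print(f"{a2_zero_percent:.1f}%")
```

A symmetric sample gives a skewness of 0; values far from 0, such as those reported for A4 and A6, indicate a long tail on one side.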

Variables

A1
Real number (ℝ≥0)

Distinct count: 76
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.23829310344828
Minimum: 27.0
Maximum: 126.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 453.2 KiB

Quantile statistics

Minimum: 27
5-th percentile: 37
Q1: 38
Median: 45
Q3: 55
95-th percentile: 79
Maximum: 126
Range: 99
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 12.23808169
Coefficient of variation (CV): 0.2537005541
Kurtosis: 6.50879166
Mean: 48.2382931
Median Absolute Deviation (MAD): 8
Skewness: 2.180838482
Sum: 2797821
Variance: 149.7706435
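
Several of the statistics reported for A1 are derived from others, which makes them easy to sanity-check. The sketch below copies the reported numbers by hand and recomputes the range, IQR, coefficient of variation, and variance; the variable names are illustrative only.

```python
# Values copied from the A1 quantile and descriptive statistics.
minimum, q1, q3, maximum = 27, 38, 55, 126
mean, std = 48.2382931, 12.23808169

value_range = maximum - minimum   # reported Range: 99
iqr = q3 - q1                     # reported IQR: 17
cv = std / mean                   # reported CV: 0.2537005541
variance = std ** 2               # reported Variance: 149.7706435

print(value_range, iqr, round(cv, 6), round(variance, 4))
```

The same ratio explains the extreme CV reported for A2 (-4008.480527): its mean (-0.01944827586) is close to zero, so dividing the standard deviation (77.95803508) by it inflates the result.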
Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
37                  13308   22.9%
55                  5949    10.3%
56                  5658    9.8%
41                  3299    5.7%
45                  3251    5.6%
44                  2596    4.5%
49                  2439    4.2%
43                  2438    4.2%
46                  2302    4.0%
51                  1802    3.1%
53                  1700    2.9%
42                  1379    2.4%
38                  1285    2.2%
47                  1175    2.0%
48                  1150    2.0%
39                  923     1.6%
50                  840     1.4%
52                  782     1.3%
40                  622     1.1%
54                  424     0.7%
81                  382     0.7%
58                  340     0.6%
82                  326     0.6%
57                  319     0.5%
79                  316     0.5%
Other values (51)   2995    5.2%

Value               Count   Frequency (%)
27                  3       < 0.1%
36                  61      0.1%
37                  13308   22.9%
38                  1285    2.2%
39                  923     1.6%
40                  622     1.1%
41                  3299    5.7%
42                  1379    2.4%
43                  2438    4.2%
44                  2596    4.5%

Value               Count   Frequency (%)
126                 1       < 0.1%
123                 8       < 0.1%
121                 1       < 0.1%
120                 2       < 0.1%
116                 3       < 0.1%
114                 1       < 0.1%
111                 1       < 0.1%
108                 28      < 0.1%
107                 72      0.1%
106                 87      0.1%

A2
Real number (ℝ)

ZEROS

Distinct count: 206
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.019448275862068966
Minimum: -4821.0
Maximum: 5075.0
Zeros: 35878
Zeros (%): 61.9%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -4821
5-th percentile: -4
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4
Maximum: 5075
Range: 9896
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 77.95803508
Coefficient of variation (CV): -4008.480527
Kurtosis: 2647.317448
Mean: -0.01944827586
Median Absolute Deviation (MAD): 0
Skewness: 6.438145983
Sum: -1128
Variance: 6077.455233

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
0                   35878   61.9%
-1                  3828    6.6%
1                   3713    6.4%
2                   2225    3.8%
-2                  2210    3.8%
3                   2023    3.5%
4                   1692    2.9%
5                   1556    2.7%
-3                  1520    2.6%
-4                  1386    2.4%
-5                  1247    2.1%
6                   59      0.1%
-6                  54      0.1%
-8                  32      0.1%
7                   24      < 0.1%
-7                  22      < 0.1%
8                   19      < 0.1%
-11                 16      < 0.1%
9                   16      < 0.1%
-10                 15      < 0.1%
-9                  14      < 0.1%
-13                 11      < 0.1%
-42                 11      < 0.1%
-27                 11      < 0.1%
-33                 10      < 0.1%
Other values (181)  408     0.7%

Value               Count   Frequency (%)
-4821               1       < 0.1%
-4624               1       < 0.1%
-4475               1       < 0.1%
-4184               1       < 0.1%
-4048               1       < 0.1%
-3700               1       < 0.1%
-3161               1       < 0.1%
-2544               1       < 0.1%
-2460               1       < 0.1%
-1865               1       < 0.1%

Value               Count   Frequency (%)
5075                1       < 0.1%
4903                1       < 0.1%
4692                1       < 0.1%
4501                1       < 0.1%
4400                1       < 0.1%
4254                1       < 0.1%
3447                1       < 0.1%
3328                1       < 0.1%
3049                1       < 0.1%
2561                1       < 0.1%

A3
Real number (ℝ≥0)

Distinct count: 51
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.34912068965517
Minimum: 21.0
Maximum: 149.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 453.2 KiB

Quantile statistics

Minimum: 21
5-th percentile: 76
Q1: 79
Median: 83
Q3: 89
95-th percentile: 106
Maximum: 149
Range: 128
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.902768762
Coefficient of variation (CV): 0.1043100232
Kurtosis: 0.543597328
Mean: 85.34912069
Median Absolute Deviation (MAD): 5
Skewness: 1.096128712
Sum: 4950249
Variance: 79.25929163

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
77                  6100    10.5%
81                  4860    8.4%
79                  4631    8.0%
86                  4389    7.6%
76                  3935    6.8%
83                  3402    5.9%
78                  2898    5.0%
84                  2392    4.1%
80                  2381    4.1%
88                  2319    4.0%
82                  2151    3.7%
97                  1720    3.0%
95                  1695    2.9%
87                  1541    2.7%
96                  1231    2.1%
85                  1230    2.1%
106                 1153    2.0%
75                  961     1.7%
93                  806     1.4%
90                  790     1.4%
92                  758     1.3%
108                 749     1.3%
98                  635     1.1%
89                  625     1.1%
104                 623     1.1%
Other values (26)   4025    6.9%

Value               Count   Frequency (%)
21                  1       < 0.1%
29                  2       < 0.1%
40                  1       < 0.1%
44                  1       < 0.1%
64                  2       < 0.1%
71                  15      < 0.1%
72                  26      < 0.1%
73                  14      < 0.1%
74                  183     0.3%
75                  961     1.7%

Value               Count   Frequency (%)
149                 1       < 0.1%
141                 1       < 0.1%
118                 1       < 0.1%
113                 60      0.1%
112                 49      0.1%
111                 136     0.2%
110                 94      0.2%
109                 405     0.7%
108                 749     1.3%
107                 558     1.0%

A4
Real number (ℝ)

SKEWED
ZEROS

Distinct count: 137
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.25967241379310346
Minimum: -3939.0
Maximum: 3830.0
Zeros: 38055
Zeros (%): 65.6%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -3939
5-th percentile: -4
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5
Maximum: 3830
Range: 7769
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 36.5215156
Coefficient of variation (CV): 140.6445724
Kurtosis: 7698.224846
Mean: 0.2596724138
Median Absolute Deviation (MAD): 0
Skewness: 31.68785333
Sum: 15061
Variance: 1333.821102

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
0                   38055   65.6%
-1                  2886    5.0%
1                   2255    3.9%
-2                  2059    3.5%
2                   1672    2.9%
-3                  1593    2.7%
3                   1277    2.2%
-4                  1086    1.9%
4                   1038    1.8%
5                   964     1.7%
6                   890     1.5%
-6                  885     1.5%
-5                  845     1.5%
-7                  792     1.4%
8                   762     1.3%
7                   523     0.9%
-8                  58      0.1%
9                   26      < 0.1%
-9                  18      < 0.1%
-10                 17      < 0.1%
-12                 16      < 0.1%
-11                 16      < 0.1%
10                  15      < 0.1%
13                  13      < 0.1%
-13                 13      < 0.1%
Other values (112)  226     0.4%

Value               Count   Frequency (%)
-3939               1       < 0.1%
-2044               1       < 0.1%
-1108               1       < 0.1%
-674                1       < 0.1%
-587                1       < 0.1%
-495                1       < 0.1%
-478                1       < 0.1%
-362                1       < 0.1%
-318                1       < 0.1%
-273                1       < 0.1%

Value               Count   Frequency (%)
3830                1       < 0.1%
3743                1       < 0.1%
2674                1       < 0.1%
2565                1       < 0.1%
2006                1       < 0.1%
1751                1       < 0.1%
1167                1       < 0.1%
769                 1       < 0.1%
737                 1       < 0.1%
692                 1       < 0.1%

A5
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct count: 54
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.54986206896552
Minimum: -188.0
Maximum: 436.0
Zeros: 706
Zeros (%): 1.2%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -188
5-th percentile: -10
Q1: 26
Median: 42
Q3: 46
95-th percentile: 56
Maximum: 436
Range: 624
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 21.66013857
Coefficient of variation (CV): 0.6269240243
Kurtosis: 8.547202399
Mean: 34.54986207
Median Absolute Deviation (MAD): 8
Skewness: -1.162427473
Sum: 2003892
Variance: 469.1616028

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
46                  6705    11.6%
42                  6240    10.8%
44                  5947    10.3%
38                  4567    7.9%
50                  4162    7.2%
54                  3205    5.5%
52                  3084    5.3%
36                  2257    3.9%
56                  1733    3.0%
34                  1572    2.7%
28                  1418    2.4%
26                  1222    2.1%
24                  1182    2.0%
20                  1177    2.0%
30                  1033    1.8%
18                  934     1.6%
16                  818     1.4%
8                   815     1.4%
10                  809     1.4%
12                  762     1.3%
-2                  710     1.2%
6                   709     1.2%
0                   706     1.2%
-4                  691     1.2%
70                  533     0.9%
Other values (29)   5009    8.6%

Value               Count   Frequency (%)
-188                4       < 0.1%
-160                1       < 0.1%
-100                2       < 0.1%
-46                 66      0.1%
-42                 326     0.6%
-40                 464     0.8%
-38                 71      0.1%
-36                 67      0.1%
-32                 68      0.1%
-30                 78      0.1%

Value               Count   Frequency (%)
436                 2       < 0.1%
336                 2       < 0.1%
310                 1       < 0.1%
98                  1       < 0.1%
72                  357     0.6%
70                  533     0.9%
68                  89      0.2%
64                  96      0.2%
62                  154     0.3%
60                  223     0.4%

A6
Real number (ℝ)

SKEWED
ZEROS

Distinct count: 299
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6081896551724137
Minimum: -26739.0
Maximum: 15164.0
Zeros: 18420
Zeros (%): 31.8%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -26739
5-th percentile: -23
Q1: -5
Median: 0
Q3: 5
95-th percentile: 24
Maximum: 15164
Range: 41903
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 217.5976752
Coefficient of variation (CV): 135.3059787
Kurtosis: 5979.233609
Mean: 1.608189655
Median Absolute Deviation (MAD): 5
Skewness: -21.86181042
Sum: 93275
Variance: 47348.74824

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
0                   18420   31.8%
-1                  1333    2.3%
1                   1316    2.3%
-4                  1236    2.1%
-3                  1193    2.1%
-2                  1161    2.0%
-6                  1122    1.9%
6                   1092    1.9%
3                   1078    1.9%
5                   1077    1.9%
2                   1073    1.8%
4                   1026    1.8%
-5                  1001    1.7%
7                   877     1.5%
-7                  840     1.4%
8                   809     1.4%
9                   794     1.4%
-8                  783     1.4%
11                  730     1.3%
10                  728     1.3%
-9                  703     1.2%
-10                 694     1.2%
-11                 691     1.2%
12                  681     1.2%
14                  672     1.2%
Other values (274)  16870   29.1%

Value               Count   Frequency (%)
-26739              1       < 0.1%
-13839              1       < 0.1%
-12809              1       < 0.1%
-11042              1       < 0.1%
-10453              1       < 0.1%
-8392               1       < 0.1%
-4141               1       < 0.1%
-2944               1       < 0.1%
-2385               1       < 0.1%
-2377               1       < 0.1%

Value               Count   Frequency (%)
15164               1       < 0.1%
13148               1       < 0.1%
12588               1       < 0.1%
12169               1       < 0.1%
11749               1       < 0.1%
9931                1       < 0.1%
8098                1       < 0.1%
7973                1       < 0.1%
6339                1       < 0.1%
4910                1       < 0.1%

A7
Real number (ℝ)

Distinct count: 86
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.09231034482759
Minimum: -48.0
Maximum: 105.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -48
5-th percentile: 4
Q1: 32
Median: 39
Q3: 42
95-th percentile: 61
Maximum: 105
Range: 153
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 13.11142808
Coefficient of variation (CV): 0.3534810303
Kurtosis: 1.62177146
Mean: 37.09231034
Median Absolute Deviation (MAD): 5
Skewness: -0.3684014245
Sum: 2151354
Variance: 171.9095462

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
40                  4830    8.3%
41                  4145    7.1%
42                  3416    5.9%
39                  3250    5.6%
38                  2791    4.8%
43                  2653    4.6%
37                  2522    4.3%
35                  2067    3.6%
36                  2053    3.5%
33                  1687    2.9%
34                  1545    2.7%
32                  1372    2.4%
31                  1340    2.3%
22                  1328    2.3%
4                   1321    2.3%
44                  1202    2.1%
45                  1195    2.1%
23                  1132    2.0%
30                  1086    1.9%
25                  1064    1.8%
46                  996     1.7%
28                  960     1.7%
21                  928     1.6%
29                  882     1.5%
26                  870     1.5%
Other values (61)   11365   19.6%

Value               Count   Frequency (%)
-48                 1       < 0.1%
-43                 2       < 0.1%
-27                 1       < 0.1%
-26                 2       < 0.1%
-19                 1       < 0.1%
-18                 8       < 0.1%
-16                 1       < 0.1%
-15                 1       < 0.1%
-10                 3       < 0.1%
-8                  1       < 0.1%

Value               Count   Frequency (%)
105                 1       < 0.1%
104                 1       < 0.1%
75                  2       < 0.1%
73                  10      < 0.1%
72                  50      0.1%
71                  149     0.3%
70                  86      0.1%
69                  518     0.9%
68                  428     0.7%
67                  477     0.8%

A8
Real number (ℝ)

HIGH CORRELATION

Distinct count: 123
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.88455172413793
Minimum: -353.0
Maximum: 270.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -353
5-th percentile: 29
Q1: 37
Median: 44
Q3: 60
95-th percentile: 94
Maximum: 270
Range: 623
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 21.41805059
Coefficient of variation (CV): 0.4209145972
Kurtosis: 8.41627801
Mean: 50.88455172
Median Absolute Deviation (MAD): 10
Skewness: 1.06620442
Sum: 2951304
Variance: 458.7328912

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
37                  2772    4.8%
39                  2703    4.7%
41                  2600    4.5%
35                  2408    4.2%
44                  1993    3.4%
42                  1988    3.4%
32                  1901    3.3%
34                  1689    2.9%
46                  1640    2.8%
43                  1525    2.6%
40                  1500    2.6%
30                  1446    2.5%
38                  1343    2.3%
36                  1337    2.3%
48                  1320    2.3%
33                  1254    2.2%
45                  1071    1.8%
49                  961     1.7%
50                  941     1.6%
55                  853     1.5%
51                  814     1.4%
28                  798     1.4%
29                  786     1.4%
31                  773     1.3%
57                  744     1.3%
Other values (98)   20840   35.9%

Value               Count   Frequency (%)
-353                2       < 0.1%
-258                2       < 0.1%
-191                1       < 0.1%
-14                 1       < 0.1%
4                   1       < 0.1%
16                  1       < 0.1%
20                  1       < 0.1%
21                  8       < 0.1%
22                  49      0.1%
23                  209     0.4%

Value               Count   Frequency (%)
270                 1       < 0.1%
269                 1       < 0.1%
265                 2       < 0.1%
240                 1       < 0.1%
184                 2       < 0.1%
131                 3       < 0.1%
130                 45      0.1%
129                 28      < 0.1%
128                 157     0.3%
127                 65      0.1%

A9
Real number (ℝ)

ZEROS

Distinct count: 77
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.932413793103448
Minimum: -356.0
Maximum: 266.0
Zeros: 20134
Zeros (%): 34.7%
Memory size: 453.2 KiB

Quantile statistics

Minimum: -356
5-th percentile: 0
Q1: 0
Median: 2
Q3: 14
95-th percentile: 78
Maximum: 266
Range: 622
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 25.61401796
Coefficient of variation (CV): 1.838447978
Kurtosis: 8.438483988
Mean: 13.93241379
Median Absolute Deviation (MAD): 2
Skewness: 2.243243705
Sum: 808080
Variance: 656.0779162

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
0                   20134   34.7%
2                   14758   25.4%
6                   1775    3.1%
8                   1731    3.0%
4                   1694    2.9%
14                  1540    2.7%
12                  1255    2.2%
16                  1120    1.9%
10                  881     1.5%
32                  819     1.4%
34                  730     1.3%
18                  707     1.2%
22                  635     1.1%
24                  595     1.0%
30                  592     1.0%
40                  555     1.0%
26                  473     0.8%
28                  469     0.8%
36                  458     0.8%
42                  415     0.7%
56                  319     0.5%
20                  318     0.5%
46                  294     0.5%
58                  289     0.5%
60                  278     0.5%
Other values (52)   5166    8.9%

Value               Count   Frequency (%)
-356                2       < 0.1%
-298                2       < 0.1%
-264                1       < 0.1%
-18                 1       < 0.1%
-14                 3       < 0.1%
-12                 2       < 0.1%
-2                  1       < 0.1%
0                   20134   34.7%
2                   14758   25.4%
4                   1694    2.9%

Value               Count   Frequency (%)
266                 1       < 0.1%
244                 1       < 0.1%
242                 1       < 0.1%
226                 1       < 0.1%
196                 1       < 0.1%
180                 2       < 0.1%
126                 56      0.1%
124                 182     0.3%
122                 176     0.3%
120                 212     0.4%

target
Real number (ℝ≥0)

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6947758620689655
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Memory size: 453.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 5
Maximum: 7
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.350960336
Coefficient of variation (CV): 0.7971321558
Kurtosis: 0.448393212
Mean: 1.694775862
Median Absolute Deviation (MAD): 0
Skewness: 1.502523229
Sum: 98297
Variance: 1.825093831

Histogram with fixed size bins (bins=10)
Value               Count   Frequency (%)
1                   45586   78.6%
4                   8903    15.3%
5                   3267    5.6%
3                   171     0.3%
2                   50      0.1%
7                   13      < 0.1%
6                   10      < 0.1%

Value               Count   Frequency (%)
1                   45586   78.6%
2                   50      0.1%
3                   171     0.3%
4                   8903    15.3%
5                   3267    5.6%
6                   10      < 0.1%
7                   13      < 0.1%

Value               Count   Frequency (%)
7                   13      < 0.1%
6                   10      < 0.1%
5                   3267    5.6%
4                   8903    15.3%
3                   171     0.3%
2                   50      0.1%
1                   45586   78.6%

Interactions

[Figures omitted: 100 interaction plots, one per pairing of the 10 variables (10 × 10).]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
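
The calculation described above can be sketched in plain Python. The function name `pearson_r` is illustrative and is not taken from the report or from pandas-profiling.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(pearson_r(x, y))
# Shifting and positively rescaling y leaves r unchanged.
print(pearson_r(x, [3 * v + 7 for v in y]))
```

Because the same denominator appears in the covariance and in both standard deviations, using n instead of n - 1 would give the same r.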

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
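
A minimal sketch of that procedure follows: rank both variables, then apply the Pearson formula to the ranks. Ties are given the average of the ranks they span, which is a common convention but an assumption here; all function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A nonlinear but monotonic relationship still gives rho = 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```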

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
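
The pair-counting definition above can be written directly as a short, quadratic-time sketch. It corresponds to the simplest variant (often called tau-a), with tied pairs counted as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant instead, so results may differ on data with many ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Two concordant pairs and one discordant pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```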

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

[Figures omitted: two missing-value plots. The dataset has no missing cells.]

Sample

First rows

          A1     A2     A3     A4     A5     A6     A7     A8     A9 target
    0   50.0   21.0   77.0    0.0   28.0    0.0   27.0   48.0   22.0      2
    1   55.0    0.0   92.0    0.0    0.0   26.0   36.0   92.0   56.0      4
    2   53.0    0.0   82.0    0.0   52.0   -5.0   29.0   30.0    2.0      1
    3   37.0    0.0   76.0    0.0   28.0   18.0   40.0   48.0    8.0      1
    4   37.0    0.0   79.0    0.0   34.0  -26.0   43.0   46.0    2.0      1
    5   85.0    0.0   88.0   -4.0    6.0    1.0    3.0   83.0   80.0      5
    6   56.0    0.0   81.0    0.0   -4.0   11.0   25.0   86.0   62.0      4
    7   55.0   -1.0   95.0   -3.0   54.0   -4.0   40.0   41.0    2.0      1
    8   53.0    8.0   77.0    0.0   28.0    0.0   23.0   48.0   24.0      4
    9   37.0    0.0  101.0   -7.0   28.0    0.0   64.0   73.0    8.0      1

Last rows

          A1     A2     A3     A4     A5     A6     A7     A8     A9 target
57990   38.0    2.0   79.0    0.0   38.0   18.0   42.0   41.0    0.0      1
57991  101.0    0.0  102.0    0.0   70.0   -3.0    1.0   33.0   32.0      5
57992   39.0   -2.0   80.0   -4.0   38.0    0.0   41.0   41.0    0.0      1
57993   43.0    0.0   81.0    1.0   42.0   -9.0   37.0   39.0    2.0      1
57994   49.0    0.0   87.0    0.0   46.0  -12.0   38.0   41.0    2.0      1
57995   80.0    0.0   84.0    0.0  -36.0  -29.0    4.0  120.0  116.0      5
57996   55.0    0.0   81.0    0.0  -20.0   25.0   26.0  102.0   76.0      4
57997   55.0    0.0   77.0    0.0   12.0  -22.0   22.0   65.0   42.0      4
57998   37.0    0.0  103.0    0.0   18.0  -16.0   66.0   85.0   20.0      1
57999   56.0    2.0   98.0    0.0   52.0    1.0   42.0   46.0    4.0      4